

**Amendments to the Drawings**

*See Attached Drawings*



FIG. 1



S  
y  
s  
t  
e  
m  
  
M  
e  
m  
o  
r  
y  
  
o  
r  
I  
/  
O  
  
D  
e  
v  
i  
c  
e

1  
9  
4



FIG. 2





FIG. 3





FIG. 4



**FIG. 5**

**Forwarding from RFC**

| Cycle | Address<br>(A) | Data<br>(D)  | Sec'y Data<br>(G) | Execution<br>(E) | Store<br>(S) | Write-back<br>(WB) | Store Buffers<br>(SB) |
|-------|----------------|--------------|-------------------|------------------|--------------|--------------------|-----------------------|
| 1     | st [MA1], R2   |              |                   |                  |              |                    |                       |
| 2     | nop            | st [MA1], R2 |                   |                  |              |                    |                       |
| 3     | ld R3, [MA1]   | nop          | st [MA1], R2      |                  |              |                    |                       |
| 4     |                | ld R3, [MA1] | nop               | st [MA1], R2     |              |                    |                       |
| 5     |                |              | ld R3, [MA1]      | nop              | st [MA1], R2 |                    |                       |
| 6     |                |              |                   | ld R3, [MA1]     | nop          | st [MA1], R2       |                       |
| 7     |                |              |                   |                  | ld R3, [MA1] | nop                | st [MA1], R2          |
| 8     |                |              |                   |                  |              | ld R3, [MA1]       |                       |

Note: The load/store instructions specify the same virtual load/store address.

Clock 4, E-stage: Result data destined for memory also written into RFC.

Clock 5, G-stage: Result data specified by load instruction forwarded from RFC.



**FIG. 6**

**Forwarding from Store Buffer**

| Cycle | Address<br>(A) | Data<br>(D)    | Sec'y Data<br>(G) | Execution<br>(E) | Store<br>(S)   | Write-back<br>(WB) | Stor Buff rs<br>(SB) |
|-------|----------------|----------------|-------------------|------------------|----------------|--------------------|----------------------|
| 1     | st [MA1], R2   |                |                   |                  |                |                    |                      |
| 2     | st [MA1], R3   | st [MA1], R2   |                   |                  |                |                    |                      |
| 3     | add R4, R5, R4 | st [MA1], R3   | st [MA1], R2      |                  |                |                    |                      |
| 4     | add R7, R8, R7 | add R4, R5, R4 | st [MA1], R3      | st [MA1], R2     |                |                    |                      |
| 5     | sub R4, R5, R4 | add R7, R8, R7 | add R4, R5, R4    | st [MA1], R3     | st [MA1], R2   |                    |                      |
| 6     | sub R7, R8, R7 | sub R4, R5, R4 | add R7, R8, R7    | add R4, R5, R4   | st [MA1], R3   | st [MA1], R2       |                      |
| 7     | ld R9, [MA1]   | sub R7, R8, R7 | sub R4, R5, R4    | add R7, R8, R7   | add R4, R5, R4 | st [MA1], R3       | st [MA1], R2         |
| 8     |                | ld R9, [MA1]   | sub R7, R8, R7    | sub R4, R5, R4   | add R7, R8, R7 | add R4, R5, R4     | st [MA1], R3         |
| 9     |                |                | ld R9, [MA1]      | sub R7, R8, R7   | sub R4, R5, R4 | add R7, R8, R7     |                      |
| 10    |                |                |                   | ld R9, [MA1]     | sub R7, R8, R7 | sub R4, R5, R4     |                      |
| 11    |                |                |                   |                  | ld R9, [MA1]   | sub R7, R8, R7     |                      |
| 12    |                |                |                   |                  |                | ld R9, [MA1]       |                      |

Note: The load/store instructions specify the same load/store address.

Clock 7, E-stage: Result data of first store instruction from R2 destined for memory written into first store buffer.

Clock 8, E-stage: Result data of second store instruction from R3 destined for memory written into second store buffer.

Clock 9, G-stage: Result data specified by load instruction forwarded from second store buffer.



**FIG. 7**

**Speculative Forwarding with Correction Due to Virtual Aliasing Condition**

| Cycle | Address<br>(A) | Data<br>(D)  | Sec'y Data<br>(G) | Execution<br>(E) | Store<br>(S) | Write-back<br>(WB) | Store Buffers<br>(SB) |
|-------|----------------|--------------|-------------------|------------------|--------------|--------------------|-----------------------|
| 1     | st [MA1], R2   |              |                   |                  |              |                    |                       |
| 2     | ld R4, [MA2]   | st [MA1], R2 |                   |                  |              |                    |                       |
| 3     |                | ld R4, [MA2] | st [MA1], R2      |                  |              |                    |                       |
| 4     |                |              | ld R4, [MA2]      | st [MA1], R2     |              |                    |                       |
| 5     |                |              |                   | ld R4, [MA2]     | st [MA1], R2 |                    |                       |
| 6     |                |              |                   |                  | ld R4, [MA2] | st [MA1], R2       |                       |
| 7     |                |              |                   |                  |              | ld R4, [MA2]       | st [MA1], R2          |
| n-1   |                |              |                   |                  |              |                    |                       |
| n     |                | ld R4, [MA2] |                   |                  |              |                    |                       |
| n+1   |                |              | ld R4, [MA2]      |                  |              |                    |                       |
| n+2   |                |              |                   | ld R4, [MA2]     |              |                    |                       |
| n+3   |                |              |                   |                  | ld R4, [MA2] |                    |                       |
| n+4   |                |              |                   |                  |              | ld R4, [MA2]       |                       |

**Note:** The load/store instructions specify different virtual load/store addresses that translate to the same physical address.

**Clock 5, E-stage:** Data from data unit speculatively forwarded to E-stage load instruction.

**Clocks 5 through n-1, E-stage:** Load instruction in stalled E-stage while store instruction data written to data cache.

**Clock 7, Store Buffers:** Result data of store instruction destined for memory written into store buffer.

**Clock n, D-stage:** Load instruction reissued within data unit.

**Clock n+1, G-stage:** Load instruction generates hit in data cache.



**FIG. 8**

**Speculative Forwarding with Correction Due to Access of Non-Cacheable Region**

| Cycle | Address<br>(A) | Data<br>(D)  | Sec'y Data<br>(G) | Execution<br>(E) | Store<br>(S) | Write-back<br>(WB) | Stor Buff rs<br>(SB) |
|-------|----------------|--------------|-------------------|------------------|--------------|--------------------|----------------------|
| 1     | st [MA1], R2   |              |                   |                  |              |                    |                      |
| 2     | nop            | st [MA1], R2 |                   |                  |              |                    |                      |
| 3     | ld R3, [MA1]   | nop          | st [MA1], R2      |                  |              |                    |                      |
| 4     |                | ld R3, [MA1] | nop               | st [MA1], R2     |              |                    |                      |
| 5     |                |              | ld R3, [MA1]      | nop              | st [MA1], R2 |                    |                      |
| 6     |                |              | ld R3, [MA1]      | ld R3, [MA1]     | nop          | st [MA1], R2       |                      |
| 7     |                |              |                   | ld R3, [MA1]     | nop          | st [MA1], R2       |                      |
| $n-1$ |                |              |                   | ld R3, [MA1]     |              |                    |                      |
| $n$   |                |              |                   | ld R3, [MA1]     |              |                    |                      |
| $n+1$ |                |              |                   | ld R3, [MA1]     |              |                    |                      |
| $n+2$ |                |              |                   | ld R3, [MA1]     |              |                    |                      |

Note: The load/store instructions specify the same virtual load/store address which translates to a physical address in a non-cacheable region.

Clock 6, E-stage: Storehit data from RFC speculatively forwarded to E-stage load instruction.

Clocks 6 through  $n-1$ , E-stage: Load instruction stalled in E-stage while data specified by load instruction is fetched from I/O device or system memory into a response buffer.

Clock  $n$ , E-stage: Data requested by load instruction selected from response buffer.

